How to Configure the Spellchecker Function
About Spellchecker
The spellchecker function returns relevant results for queries that contain typos or misspellings, or for which alternate spellings of the query terms are available.
- By default, SmartHub provides an American English dictionary with approximately 100,000 words as well as smart suggestions based on your indexed data.
- The dictionary can be customized to fit your industry, company, and brand queries.
- The dictionary that is used by SmartHub for a specific query is driven by the language configured for the search page that is issuing the query.
- The default SmartHub dictionary is located in the <SmartHub_root>\Dictionary directory.
Supported search engines for smart suggestions
In SmartHub 7.0, the spelling suggestion mechanism has been upgraded to suggest spelling alternatives for misspelled personal names, company names, and specific terms that are relevant to the user, based on your indexed data. This functionality is only supported by the following search engines:
- Azure AI Search
- SharePoint Online
- Elasticsearch
- OpenSearch
About Using Single and Multiple Languages in SmartHub
The following information about languages and dictionaries in SmartHub is important to understand. Predefined languages are located in the <SmartHub_root>\js\cultures directory of your local SmartHub installation.
Single Language/Dictionary Setup
In a typical environment you use:
- 1 Language
- 1 Dictionary
- You can customize your dictionary by adding terms to it.
Multilingual/Dictionary Setup
In a multilingual environment, consider the following:
- To set up multiple languages, you define a set of languages: en-US, fr-FR, etc.
- You define which page matches which language.
- In this scenario you have multiple dictionaries, one for each language you select.
- If a dictionary for a language is missing, SmartHub defaults to English (en-US).
Changing Your Environment Language
- Changing your environment language is unnecessary except in special circumstances.
- Consider languages other than the native language of your environment ONLY IF you set up a multilingual configuration and LanguageRedirects (via a custom settings file).
- The values you use under LanguageRedirects become the language of the page and the name of the dictionary.
Language Example
If you open any language file in the <SmartHub_root>\js\cultures directory, you can see that the language name it defines is independent of the language file name:
- fr.JS defines a language named fr-FR.
- In that case, you can use fr-FR in your LanguageRedirects so that the spellchecker looks for a dictionary named fr-FR.txt inside the SmartHub "Dictionary" directory.
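The naming rule above, together with the en-US fallback described for multilingual setups, can be sketched as a small lookup. This is only an illustration: `resolveDictionaryFile` and the file list are hypothetical and are not part of SmartHub.

```javascript
// Hypothetical helper, not a SmartHub API. It mirrors the rule described above:
// the dictionary file is "<language name>.txt", with en-US.txt as the fallback.
function resolveDictionaryFile(languageName, availableFiles) {
  const wanted = `${languageName}.txt`;
  // Use the dictionary that matches the page language when it exists.
  if (availableFiles.includes(wanted)) {
    return wanted;
  }
  // Otherwise fall back to the default English dictionary.
  return "en-US.txt";
}

// Assumed contents of the Dictionary folder, for illustration only.
const filesInDictionaryFolder = ["en-US.txt", "fr-FR.txt"];
console.log(resolveDictionaryFile("fr-FR", filesInDictionaryFolder)); // fr-FR.txt
console.log(resolveDictionaryFile("de-DE", filesInDictionaryFolder)); // en-US.txt
```

Note that the lookup uses the language name defined inside the culture file (fr-FR), not the culture file name (fr).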
Configuring SmartHub to Support Custom Dictionaries
Upgrading SmartHub overwrites all SmartHub files.
- Any changes made to the web.config file or the dictionary file must be backed up before SmartHub is upgraded.
- After SmartHub is upgraded, the new dictionary files must be merged with the old ones (if necessary) and restored.
To avoid this, perform the following steps:
- Navigate to your SmartHub installation directory.
- Copy the Dictionary folder and name the copy "CustomDictionary".
- Make sure that the dictionary files from the original Dictionary folder exist in the new CustomDictionary folder.
- Open IIS and expand the SmartHub site.
- Right-click on the SmartHub site and create a new Virtual Directory.
- Name it "Dictionary" and point it to the custom directory you created in step 2.
At this point you have your own copy of the dictionary, and upgrades no longer interfere with your customizations.
Upgrade note: Any additional words that SmartHub provides with new packages as part of the default dictionary will not appear in your dictionary copy. You need to add any new words to your copy of the dictionary.
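One way to find the words that an upgrade added to the default dictionary is to compare the two files term by term. The sketch below is a hypothetical helper, not a SmartHub tool, and it assumes that each dictionary line has the layout `<term> <frequency>` separated by whitespace. The sample dictionary contents are made up for illustration.

```javascript
// Hypothetical helpers, not part of SmartHub.
// Assumption: each dictionary line is "<term> <frequency>", separated by whitespace.
function parseTerms(dictionaryText) {
  const terms = new Map();
  for (const line of dictionaryText.split(/\r?\n/)) {
    const parts = line.trim().split(/\s+/);
    if (parts.length < 2) continue; // skip blank or malformed lines
    const frequency = parts.pop(); // the last token is the frequency
    terms.set(parts.join(" "), frequency);
  }
  return terms;
}

// Returns the lines of the upgraded default dictionary whose terms
// do not exist in the custom copy yet.
function findMissingLines(defaultText, customText) {
  const customTerms = parseTerms(customText);
  const missing = [];
  for (const [term, frequency] of parseTerms(defaultText)) {
    if (!customTerms.has(term)) missing.push(`${term} ${frequency}`);
  }
  return missing;
}

// Made-up sample contents.
const upgradedDefault = "search 5000\nconnector 1200\nsmarthub 900";
const customCopy = "search 5000\nsmarthub 90000\nacme 70000";
console.log(findMissingLines(upgradedDefault, customCopy)); // only "connector 1200" is missing
```

Terms that already exist in the custom copy are left alone, so customized frequencies (such as the one for "smarthub" in the sample) are not overwritten.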
Configure the SmartHub Spellchecker
UI Editor
The easiest and fastest way to insert and customize the spellchecker is through the SmartHub UI editor. To configure the spellchecker component, you edit the Advanced Settings file that you are using for your SmartHub pages. For more information about using Custom Settings, see How to Use the UI Editor. You can override any setting by adding it to the SH.SpellCheck.CustomSettings namespace.
Procedure
- In the SmartHub Administration portal, click UI Editor.
- Click the Select a page link from the top menu.
- Select an HTML page, such as Results.html.
- BA Insight recommends that you modify a page in a custom folder, for example pages/CustomResults.html, and leave the default files as templates. The default Results.html page is in the topmost SmartHub directory.
- Select the Advanced mode from the top right of the page.
- Select Advanced settings edit.
- Scroll down and locate the text "SH.SpellCheck.CustomSettings".
- Click the See Default Settings link at the top right.
- A new browser tab opens with all available SmartHub module settings.
- Scroll down to the line that contains the text "SH.SpellCheck.DefaultSettings":
SH.SpellCheck = SH.SpellCheck || {};
SH.SpellCheck.DefaultSettings = {
    "Enabled": "true",
    "AutoRunSuggestion": true,
    "MaxNumberOfSuggestions": 3,
    "ParentSelector": ".bot-autocorrect-message-root",
    "SuggestedQueryClass": "spellcheck-suggested-query",
    "MaxCorrectionDistance": 2,
    "AnalyticsFuzzyMatchAlgorithmId": '3',
    "AutoCorrectTemplate": SH.RootLevelURL + "/modules/SpellCheck/AutoCorrectTemplate.html"
}
- Copy the settings that follow the text "SH.SpellCheck.DefaultSettings = {".
- Go back to your Advanced settings edit tab.
- Paste the copied settings inside the section "SH.SpellCheck.CustomSettings = {".
- Modify the settings as desired. See the table SpellCheck Settings below for more information.
- Click the link "Preview <file>.html" at the top of the page.
- Return to the previous page and make changes, if necessary. Repeat the preview step until the desired results are achieved.
- Click Save changes.
Example Configuration
SH.SpellCheck.CustomSettings = {
    "Enabled": "true",
    "AutoRunSuggestion": true,
    "MaxNumberOfSuggestions": 3,
    "ParentSelector": ".bot-autocorrect-message-root",
    "SuggestedQueryClass": "spellcheck-suggested-query",
    "MaxCorrectionDistance": 2,
    "AnalyticsFuzzyMatchAlgorithmId": '3',
    "AutoCorrectTemplate": SH.RootLevelURL + "/modules/SpellCheck/AutoCorrectTemplate.html"
}
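Because any setting can be overridden by adding it to SH.SpellCheck.CustomSettings, a custom block may only need the values that differ from the defaults. The sketch below illustrates that idea under a loudly stated assumption: that custom values replace defaults key by key (a shallow merge). How SmartHub combines the two objects internally is not documented here, and the `SH` object below is only a stand-in so that the example runs outside the browser.

```javascript
// Stand-in for the SH global that SmartHub provides in the browser.
// Defined here only so that the sketch runs on its own.
const SH = { RootLevelURL: "", SpellCheck: {} };

SH.SpellCheck.DefaultSettings = {
  "Enabled": "true",
  "AutoRunSuggestion": true,
  "MaxNumberOfSuggestions": 3,
  "ParentSelector": ".bot-autocorrect-message-root",
  "SuggestedQueryClass": "spellcheck-suggested-query",
  "MaxCorrectionDistance": 2,
  "AnalyticsFuzzyMatchAlgorithmId": "3",
  "AutoCorrectTemplate": SH.RootLevelURL + "/modules/SpellCheck/AutoCorrectTemplate.html"
};

// Illustrative override: show up to five suggestions and do not run one automatically.
SH.SpellCheck.CustomSettings = {
  "AutoRunSuggestion": false,
  "MaxNumberOfSuggestions": 5
};

// Assumption: custom values win over defaults, key by key (shallow merge).
const effective = Object.assign({}, SH.SpellCheck.DefaultSettings, SH.SpellCheck.CustomSettings);
console.log(effective.AutoRunSuggestion, effective.MaxNumberOfSuggestions, effective.MaxCorrectionDistance);
// false 5 2
```

With "AutoRunSuggestion" set to false, the suggestions are displayed with the class named in "SuggestedQueryClass" instead of being run automatically.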
SpellCheck Settings
The SpellCheck parameters and their values are listed below.
Setting | Value | Description |
---|---|---|
Enabled | Boolean. Default: true | If this is set to "false", the component does not make any changes to SmartHub and does not load any dependency. |
AutoRunSuggestion | Boolean. Default: true | When enabled, the suggested query with the highest score is run automatically. |
MaxNumberOfSuggestions | Integer. Default: 3 | This specifies the maximum number of suggestions that are displayed as alternative spellings of the query. |
ParentSelector | String. Default: ".bot-autocorrect-message-root" | This specifies the selector of the HTML element under which suggestions are displayed. |
SuggestedQueryClass | String. Default: "spellcheck-suggested-query" | This specifies the class of the HTML elements for suggestions that are displayed when "AutoRunSuggestion" is set to false. |
MaxCorrectionDistance | Integer. Default: 2 | This is the maximum correction distance between the original and the corrected queries. For example, the distance between the original query "tst" and suggested query "test" is 2, but between "tst" and "quest" is 3, so only the first suggestion is accepted. |
AnalyticsFuzzyMatchAlgorithmId | String. Default: '3' | This specifies the analytics algorithm ID used for the fuzzy match query, in case no suggestions are returned. |
AutoCorrectTemplate | String. Default: SH.RootLevelURL + "/modules/SpellCheck/AutoCorrectTemplate.html" | This specifies the path to the template used to display suggestions. |
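To make the "MaxCorrectionDistance" setting concrete, the sketch below filters suggestions by edit distance. The exact distance measure that SmartHub uses is not stated here, so this example assumes plain Levenshtein distance (insertions, deletions, and substitutions), and the function names are hypothetical. Under that assumption "tst" to "test" is a single edit (the table quotes 2 for this pair, which may reflect a different measure) and "tst" to "quest" is 3, so with the default limit of 2 the outcome is the same as described: only the first suggestion is accepted.

```javascript
// Plain Levenshtein distance. Assumption: SmartHub may use a different measure.
function editDistance(a, b) {
  // previous[j] = distance between the first i-1 characters of a and the first j of b.
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i]; // turning i characters into an empty string costs i deletions
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      current[j] = Math.min(previous[j] + 1, current[j - 1] + 1, substitution);
    }
    previous = current;
  }
  return previous[b.length];
}

// Hypothetical filter that mirrors the role of MaxCorrectionDistance.
function withinLimit(query, suggestion, maxCorrectionDistance) {
  return editDistance(query, suggestion) <= maxCorrectionDistance;
}

console.log(withinLimit("tst", "test", 2));  // true
console.log(withinLimit("tst", "quest", 2)); // false
```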
Adding New Words to SmartHub for Spell Checking
The dictionary used for spellchecking is selected based on the language code of your SmartHub search page.
- By default, the dictionary is set to en-US.txt.
- You can create your own dictionary, but SmartHub uses it only if the dictionary name matches the language code.
Dictionary Format and Term Frequency Settings
The dictionary contains:
- One line for each recognized term (word).
- Each line contains the term along with a number (between 1 and 9223372036854775807).
- The number represents the frequency of the term.
- Spell checking suggestions are ranked in descending order of frequency, and the top suggestion is returned to the UI.
- Since a typo can have multiple possible suggestions, make sure that the most frequent spelling of a term has the highest number in the dictionary.
- For example, the suggestion for "SharePoint" could be:
  - "share point"
  - "sharp point"
  - or "sharepoint"
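The effect of the frequency number can be sketched as follows. This is a simplification, not SmartHub's implementation: the real formula is more complex than a numeric comparison, the `<term> <frequency>` whitespace-separated line layout is an assumption, and the function names and frequencies are made up for illustration.

```javascript
// Hypothetical sketch, not SmartHub's implementation.
// Assumption: a dictionary line is "<term> <frequency>", separated by whitespace.
const MAX_FREQUENCY = 9223372036854775807n; // upper bound quoted above

function parseLine(line) {
  const match = /^(.+?)\s+(\d+)$/.exec(line.trim());
  if (!match) return null; // not in the assumed layout
  const frequency = BigInt(match[2]); // BigInt keeps values above 2^53 exact
  if (frequency < 1n || frequency > MAX_FREQUENCY) return null; // outside the allowed range
  return { term: match[1], frequency };
}

// Returns the candidate with the highest frequency, or null if none is valid.
function pickTopSuggestion(candidateLines) {
  let best = null;
  for (const line of candidateLines) {
    const entry = parseLine(line);
    if (entry && (best === null || entry.frequency > best.frequency)) best = entry;
  }
  return best ? best.term : null;
}

// Made-up frequencies: "sharepoint" has the highest number, so it is picked.
console.log(pickTopSuggestion(["sharp point 40", "share point 120", "sharepoint 90000"])); // sharepoint
```

The upper bound is the largest signed 64-bit integer, which is why the sketch uses BigInt instead of a regular JavaScript number.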
Sample of Lines in en-US Dictionary
How to Set the Most Frequent Words for Spelling Correction
Use a text editor to add new lines to the dictionary, and assign each one a frequency number that reflects how important or relevant the term is in your environment.
Frequency number: Note that the formula used to compute spelling suggestions is more complex than a numeric comparison of frequencies. As a general rule, the frequency of a word should correlate directly with how many times you would find that word in the documents available at search time: the higher the number, the more relevant the word.
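If you have representative document text available, the rule of thumb above can be approximated by counting how often each word occurs. The sketch below is a hypothetical helper, not a SmartHub feature. It assumes lowercase terms and the `<term> <frequency>` line layout, and the sample documents are made up.

```javascript
// Hypothetical sketch: derive dictionary frequencies from sample document text.
// Assumptions: terms are stored in lowercase and lines use "<term> <frequency>".
function buildDictionaryLines(documents) {
  const counts = new Map();
  for (const text of documents) {
    // Split on anything that is not a letter, a digit, or an apostrophe.
    for (const word of text.toLowerCase().split(/[^\p{L}\p{N}']+/u)) {
      if (word === "") continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // Most frequent terms first; ties are sorted alphabetically for stable output.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([term, count]) => `${term} ${count}`);
}

// Made-up sample documents.
const sampleDocuments = [
  "SmartHub connects SharePoint and Elasticsearch",
  "SharePoint search with SmartHub",
  "SharePoint permissions"
];
console.log(buildDictionaryLines(sampleDocuments).slice(0, 2)); // "sharepoint 3", then "smarthub 2"
```

Counts from a small sample are far lower than the frequencies in a general-purpose base dictionary, so they would need to be scaled before being merged into one.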
Adding Dictionaries for Other Languages to SmartHub for Spell Checking
As mentioned above, the dictionary used for spell checking is selected based on the language code of your SmartHub search page. The out-of-the-box dictionary is en-US.txt, but you can create your own dictionary as long as its name matches the language code.
You can download additional "base" dictionaries which contain general words and frequencies for a given language from open source repositories, such as: https://github.com/hermitdave/FrequencyWords/tree/master/content/2018.
Rename the dictionary file to "<language-code>.txt" (where the language code matches the language configured for your search page), and store it in the Dictionary folder.